Mt Xia: Technical Consulting Group

Business Continuity / Disaster Recovery / High Availability
Data Center Automation / Audit Response / Audit Compliance

Additional documents of interest

  • Successful Business Continuity - Part 1 - Users and Groups
    This article was published in the April 2005 issue of AIX Update magazine and discusses system administration needs and requirements oriented around users and groups. The overall emphasis of this series of articles is for implementation of enterprise wide unique identifiers for a variety of parameters, such as user names, group names, UID and GID numbers.
  • Successful Business Continuity - Part 2 - Machine and Host Names
    This article was published in the May 2005 issue of AIX Update magazine and discusses naming structures for machines, systems, adapters, and aliases. The overall emphasis of this series of articles is for implementation of enterprise wide unique identifiers for a variety of parameters.
  • Successful Business Continuity - Part 3 - Volume Names
    This article was published in the December 2005 issue of AIX Update magazine and discusses naming structures for volume groups, logical volumes, log logical volumes, directory mount points, etc. The overall emphasis of this series of articles is for implementation of enterprise wide unique identifiers for a variety of parameters.
  • Successful Business Continuity - Part 4 - MQ Series, Startup/Shutdown Scripts, Error Processing
    This article was published in the April 2006 issue of AIX Update magazine and discusses how to implement AIX in an environment dedicated to business continuity. The topic of this article is the assignment of MQ Series queue names and aliases, resource group startup and shutdown script names (Application startup/shutdown script names), error logging, and error notification.
  • Successful Business Continuity - Part 5 - Miscellaneous topics
    This article was published in the August 2006 issue of AIX Update magazine and discusses how to implement AIX in an environment dedicated to business continuity. A variety of topics is discussed in this article including automated documentation generation and management.
  • Automated Microcode Management System
    One of the most difficult administration tasks in an AIX environment is attempting to keep the firmware and microcode up-to-date. Mt Xia has devised an automated method of gathering the Microcode information, determining which microcode needs to be updated, generating reports, and uploading the required microcode updates to each individual system.
  • Calculating the size of a Virtual Processor
    This document describes the algorithms used to calculate the size of a virtual processor when using shared processors in an LPAR. The IBM documentation describes how to calculate CPU utilization, NOT how to size for configuration; this document clarifies that process. A description of the HMC input fields for the processor tab is included.
  • Basics of Partition Load Manager Setup
    This presentation on basic PLM setup was provided by Ron Barker from IBM and is available in ppt and pdf formats.

    The Partition Load Manager (PLM) provides CPU and memory resource management and monitoring across logical partitions (LPARs). PLM lets you use CPU and memory resources effectively by setting thresholds for designated resources. When a threshold is exceeded, PLM can try to assign additional CPU and/or memory resources to that LPAR, drawing on resources that are assigned to other LPARs but are not being used.

    PLM is an automated mechanism for utilizing the Dynamic LPAR (DLPAR) capabilities of the HMC and requires communication with the HMC. This means that before PLM will function, DLPAR must be functional on the HMC. DLPAR requires communication with each LPAR via the Resource Monitoring and Control (RMC) subsystem.


    Preparation for implementation of PLM

    Install and configure SSL and OpenSSH.

    Verify or install the following fileset on the PLM Server and every PLM client LPAR:

    • csm.client

    After installation of the "csm.client" fileset, run the following commands to initialize the RMC subsystem:

    cd /usr/sbin/rsct/install/bin
    ./recfgct
    lssrc -a | grep rsct
    

    From the above "lssrc" output, check to ensure "IBM.CSMAgentRM" is running. Repeat these steps on every PLM client LPAR.
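
    When this check has to be repeated on many LPARs, it can be scripted. The following is a minimal sketch, assuming the usual "lssrc" column layout (subsystem name in the first column, status in the last); the function name "csm_agent_active" is hypothetical and not part of RSCT.

```shell
# Sketch: read "lssrc -a" output on stdin and succeed only if
# IBM.CSMAgentRM is listed with a status of "active".
# Assumption: subsystem name is the first field, status is the last.
csm_agent_active() {
    awk '$1 == "IBM.CSMAgentRM" && $NF == "active" { found = 1 }
         END { exit !found }'
}

# Intended use on an LPAR:
#   lssrc -a | csm_agent_active || echo "IBM.CSMAgentRM is not active" >&2
```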


    Before implementing the rest of this procedure, verify that the HMC is able to perform DLPAR operations on the client LPAR. If the HMC is unable to perform a DLPAR operation, PLM will not work.

    Implementing PLM

    Install the following filesets:

    • plm.license
    • plm.server.rte
    • plm.sysmgt.websm

    For setup of PLM, create .rhosts files on the server and all clients. After PLM has been set up, you can delete the .rhosts files.


    Create SSH keys

    On the PLM server, enter:

    ssh-keygen -t rsa
    

    Copy the HMC secure keys to the PLM server

    scp hscroot@hmchostname:.ssh/authorized_keys2 ~/.ssh/tmp_authorized_keys2
    

    Append the PLM server keys to the temporary key file and copy it back to the HMC:

    cat ~/.ssh/id_rsa.pub >> ~/.ssh/tmp_authorized_keys2
    scp ~/.ssh/tmp_authorized_keys2 hscroot@hmchostname:.ssh/authorized_keys2
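
    If the procedure is run more than once, the "cat ... >>" step adds the same key to the file again each time. The sketch below is one possible way to make the append step repeatable; the function name "append_key_once" and its arguments are hypothetical, and it assumes the key file ends with a newline.

```shell
# Sketch: append the public key in $1 to the key file in $2, but only
# if that exact line is not already present in the key file.
append_key_once() {
    pubkey_file=$1
    keys_file=$2
    key=$(cat "$pubkey_file") || return 1
    touch "$keys_file" || return 1
    # -x: match whole lines only, -F: treat the key as a fixed string
    if ! grep -qxF -- "$key" "$keys_file"; then
        printf '%s\n' "$key" >> "$keys_file"
    fi
}

# Intended use on the PLM server, in place of the "cat" command:
#   append_key_once ~/.ssh/id_rsa.pub ~/.ssh/tmp_authorized_keys2
```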
    


    Test SSH and enable WebSM

    Test SSH to the HMC. You should not be asked for a password.

    ssh hscroot@hmchostname lssyscfg -r sys
    

    On the PLM server, make sure you can run WebSM:

    /usr/websm/bin/wsmserver -enable
    


    Configure the PLM Server

    On the PLM server, open WebSM and select Partition Load Manager.

    Click on the Globals tab and enter the fully qualified hostname of your HMC. Enter "hscroot" as the HMC user name. Enter the CEC name, which can be obtained by running the following command on the PLM server:

    ssh hscroot@hmchostname lssyscfg -r sys -F name
    

    Select the system name that corresponds to the frame you are configuring in the PLM server and enter this as the CEC name.

    Click on the Groups tab and add the groups "dedicated" and "shared". The maximum values should be the total amount of CPU and memory on the frame being configured to be managed by the PLM. Click on CPU and memory management to manage both.

    Click on the Partitions tab and add all the LPARs on the frame to be managed by the PLM. Use the fully qualified domain name as the partition name for each LPAR.

    Click on OK to create the policy file and verify its existence on the PLM server under "/etc/plm/policies".

    From the WebSM interface of the PLM, perform the PLM setup. NOTE: You must be logged into the PLM server through the WebSM interface as "root" to perform this step.


    Test RMC Authentication

    Test RMC authentication by running the following command from the PLM server, where <plm_client_name> is the hostname of the LPAR that will be managed by PLM.

    CT_CONTACT=<plm_client_name> lsrsrc IBM.LPAR
    

    If successful, several lines of LPAR information will be printed out instead of "Could not authenticate user".
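
    With several client LPARs, the same test can be wrapped in a small function and run in a loop. This is only a sketch: "check_rmc_auth" and the "LSRSRC" variable are hypothetical names, and "LSRSRC" exists solely so a substitute command can be used for a dry run. The failure check relies on the "Could not authenticate user" message quoted above.

```shell
# Sketch: report whether RMC authentication to one client LPAR works.
# $1 is the client hostname. LSRSRC defaults to the real lsrsrc command.
check_rmc_auth() {
    client=$1
    out=$(CT_CONTACT="$client" ${LSRSRC:-lsrsrc} IBM.LPAR 2>&1) && rc=0 || rc=$?
    # Treat the authentication message as a failure regardless of exit code
    case $out in
        *"Could not authenticate user"*) rc=1 ;;
    esac
    if [ "$rc" -eq 0 ]; then
        echo "$client: RMC authentication OK"
    else
        echo "$client: RMC authentication FAILED"
    fi
    return "$rc"
}

# Intended use from the PLM server (hostnames are placeholders):
#   for h in lpar1.example.com lpar2.example.com; do check_rmc_auth "$h"; done
```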


    Start the PLM Server

    From the WebSM interface of the PLM server, start the PLM server. Enter the full path of the policy file, which is the directory "/etc/plm/policies" followed by the serial number of the frame. Any alphabetic characters in the serial number must be entered in UPPERCASE letters. For example:

    /etc/plm/policies/10F6BEE
    

    Also enter the full path of a log file where the PLM will store activity information. Several utilities depend on the information in this log file, so it is important that it be created in the correct directory with the correct name. The log file directory is "/var/opt/plm" and the log file name is the serial number of the frame followed by ".log". Any alphabetic characters in the serial number must be entered in UPPERCASE letters. For example:

    /var/opt/plm/10F6BEE.log
    

    NOTE: You may have to "touch" the logfile before starting the PLM Server
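
    Because both names are derived from the frame serial number and must be in uppercase, they can be generated rather than typed. A minimal sketch follows; the function names "plm_policy_file" and "plm_log_file" are hypothetical helpers, not PLM commands.

```shell
# Sketch: build the policy and log file names from a frame serial number,
# forcing any alphabetic characters to uppercase.
plm_upper() {
    printf '%s' "$1" | tr '[:lower:]' '[:upper:]'
}

plm_policy_file() {
    printf '/etc/plm/policies/%s\n' "$(plm_upper "$1")"
}

plm_log_file() {
    printf '/var/opt/plm/%s.log\n' "$(plm_upper "$1")"
}

# Intended use, including the "touch" mentioned in the note above:
#   touch "$(plm_log_file 10f6bee)"
```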


    Troubleshooting

    If the PLM server does not start, check the PLM server file "/var/ct/cfg/ctrmc.acls" to ensure the following lines are at the bottom of the file:

    IBM.LPAR
    	root@hmcHostname	*	rw
    

    NOTE: Even though there is no access to the "root" user on the HMC, this line should still reference "root@hmcHostname".

    On the PLM client LPAR, check the same file "/var/ct/cfg/ctrmc.acls" to ensure the following lines are at the bottom of the file. Note that on a PLM client LPAR the last line references the PLM Server hostname rather than the HMC hostname:

    IBM.LPAR
    	root@plmServerHostName	*	rw
    

    If you edit the "/var/ct/cfg/ctrmc.acls" file on the PLM server or on a PLM client LPAR, refresh the RMC subsystem on each modified system:

    refresh -s ctrmc
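
    The ACL check can also be scripted. The sketch below assumes the file layout shown above, that is, an unindented stanza name followed by indented entries of the form "user, resource, permission"; the function name "acl_has_entry" is hypothetical. It only tests that the entry exists within the stanza and does not verify that it is at the bottom of the file.

```shell
# Sketch: succeed if file $1 has, inside stanza $2, an entry for user $3
# with "rw" permission.
acl_has_entry() {
    awk -v stanza="$2" -v user="$3" '
        # an unindented, non-comment line starts a new stanza
        /^[^ \t#]/ { in_stanza = ($1 == stanza) }
        # indented lines are entries belonging to the current stanza
        in_stanza && /^[ \t]/ && $1 == user && $3 == "rw" { found = 1 }
        END { exit !found }
    ' "$1"
}

# Intended use on the PLM server (hostname is a placeholder):
#   acl_has_entry /var/ct/cfg/ctrmc.acls IBM.LPAR root@hmcHostname \
#       || echo "IBM.LPAR entry missing" >&2
```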
    


    Troubleshooting

    If the PLM server still does not start, there is most likely an RMC authentication problem. Begin by obtaining a list of trusted hosts by running the following command on the PLM server:

    /usr/sbin/rsct/bin/ctsvhbal
    

    One or more identities of the PLM client LPAR should appear in this list. If not, you may need to rerun the PLM setup. This can be done from the WebSM interface or from the command line on the PLM server:

    cd /etc/plm/setup
    ./plmsetup <plmClientHostName> root
    

    On the PLM client LPAR, check the list of trusted hosts by running the following command:

    /usr/sbin/rsct/bin/ctsthl -l
    

    The PLM Server host name should appear in this list. If multiple identities exist, it is usually a good idea to remove them all and rerun the PLM setup command on the PLM server. To remove the trusted host identities on a PLM client LPAR, run the following command:

    /usr/sbin/rsct/bin/ctsthl -d -n <hostname>
    

    Trusted host identities can be added on the PLM server or client LPARs using the following command:

    /usr/sbin/rsct/bin/ctsthl -a -n <hostname> -m rsa512 -p <identifier>
    

    The <identifier> value can be obtained by running "ctsthl -l" on the opposite system.


    Troubleshooting

    One problem encountered with the PLM server occurred in the WebSM interface: clicking the link labeled "Show LPAR Statistics" opened a dialog window filled with Java errors, and the statistics screen would not start. This was apparently due to a formatting problem in the policy file itself. The PLM server still starts in this situation, and there are no obvious errors other than the failure of the "Show LPAR Statistics" link.

    The fix for this problem is to delete the policy file and create a new one.
